Exploration server for computer science research in Lorraine

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Functional classification of genes using semantic distance and fuzzy clustering approach: evaluation with reference sets and overlap analysis.

Internal identifier: 001D13 (Main/Exploration); previous: 001D12; next: 001D14

Authors: Marie-Dominique Devignes [France]; Sidahmed Benabderrahmane; Malika Smaïl-Tabbone; Amedeo Napoli; Olivier Poch

Source: International journal of computational biology and drug design (ISSN 1756-0756), 2012

RBID : pubmed:23013652

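French descriptors

Algorithmes; Analyse de regroupements; Bases de données génétiques; Famille multigénique; Humains; Logique floue; Tests de criblage à haut débit (méthodes)

English descriptors

Algorithms; Cluster Analysis; Databases, Genetic; Fuzzy Logic; High-Throughput Screening Assays (methods); Humans; Multigene Family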

Abstract

Functional classification aims at grouping genes according to their molecular function or the biological process they participate in. Evaluating the validity of such unsupervised gene classification remains a challenge given the variety of distance measures and classification algorithms that can be used. We evaluate here functional classification of genes with the help of reference sets: KEGG (Kyoto Encyclopaedia of Genes and Genomes) pathways and Pfam clans. These sets represent ground truth for any distance based on GO (Gene Ontology) biological process and molecular function annotations respectively. Overlaps between clusters and reference sets are estimated by the F-score method. We test our previously described IntelliGO semantic distance with hierarchical and fuzzy C-means clustering and we compare results with the state-of-the-art DAVID (Database for Annotation Visualisation and Integrated Discovery) functional classification method. Finally, study of best matching clusters to reference sets leads us to propose a set-difference method for discovering missing information.
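The overlap evaluation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact F-score formulation, so the code below assumes the standard harmonic mean of precision and recall between a cluster and a reference set. The function names (`f_score`, `best_matches`, `set_differences`) and the toy gene identifiers are hypothetical, and clusters are treated as plain sets even though fuzzy C-means allows a gene to belong to several clusters. The set-difference step is only named in the abstract; what follows is one plausible reading, not the authors' implementation.

```python
def f_score(cluster, reference):
    """Standard F-score (assumed formulation) between two gene sets."""
    cluster, reference = set(cluster), set(reference)
    common = len(cluster & reference)
    if common == 0:
        return 0.0
    precision = common / len(cluster)    # share of the cluster found in the reference
    recall = common / len(reference)     # share of the reference recovered by the cluster
    return 2 * precision * recall / (precision + recall)


def best_matches(clusters, references):
    """For each reference set, return the best-overlapping cluster name and its F-score."""
    result = {}
    for ref_name, ref_genes in references.items():
        best = max(clusters, key=lambda name: f_score(clusters[name], ref_genes))
        result[ref_name] = (best, f_score(clusters[best], ref_genes))
    return result


def set_differences(cluster, reference):
    """Genes present on one side only: candidates for missing information (assumption)."""
    cluster, reference = set(cluster), set(reference)
    return {
        "in_cluster_only": cluster - reference,
        "in_reference_only": reference - cluster,
    }


# Toy data (hypothetical identifiers, not taken from the paper).
clusters = {"c1": {"g1", "g2", "g3"}, "c2": {"g4", "g5"}}
references = {"pathway_A": {"g1", "g2", "g3", "g6"}, "clan_B": {"g4", "g5"}}

matches = best_matches(clusters, references)
missing = set_differences(clusters["c1"], references["pathway_A"])
```

In this toy case, the gene left in `in_reference_only` belongs to the reference set but not to the best-matching cluster, which is the kind of discrepancy a set-difference analysis would flag for inspection.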

DOI: 10.1504/IJCBDD.2012.049207
PubMed: 23013652


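Affiliations:

France: Marie-Dominique Devignes (Lorraine University, Equipe Orpailleur, Campus Scientifique, Vandoeuvre les Nancy cedex)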


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Functional classification of genes using semantic distance and fuzzy clustering approach: evaluation with reference sets and overlap analysis.</title>
<author>
<name sortKey="Devignes, Marie Dominique" sort="Devignes, Marie Dominique" uniqKey="Devignes M" first="Marie-Dominique" last="Devignes">Marie-Dominique Devignes</name>
<affiliation wicri:level="1">
<nlm:affiliation>Lorraine University, Equipe Orpailleur, Campus Scientifique, Vandoeuvre les Nancy cedex, France. devignes@loria.fr</nlm:affiliation>
<country xml:lang="fr">France</country>
<wicri:regionArea>Lorraine University, Equipe Orpailleur, Campus Scientifique, Vandoeuvre les Nancy cedex</wicri:regionArea>
<wicri:noRegion>Vandoeuvre les Nancy cedex</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Benabderrahmane, Sidahmed" sort="Benabderrahmane, Sidahmed" uniqKey="Benabderrahmane S" first="Sidahmed" last="Benabderrahmane">Sidahmed Benabderrahmane</name>
</author>
<author>
<name sortKey="Smail Tabbone, Malika" sort="Smail Tabbone, Malika" uniqKey="Smail Tabbone M" first="Malika" last="Smaïl-Tabbone">Malika Smaïl-Tabbone</name>
</author>
<author>
<name sortKey="Napoli, Amedeo" sort="Napoli, Amedeo" uniqKey="Napoli A" first="Amedeo" last="Napoli">Amedeo Napoli</name>
</author>
<author>
<name sortKey="Poch, Olivier" sort="Poch, Olivier" uniqKey="Poch O" first="Olivier" last="Poch">Olivier Poch</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="2012">2012</date>
<idno type="doi">10.1504/IJCBDD.2012.049207</idno>
<idno type="RBID">pubmed:23013652</idno>
<idno type="pmid">23013652</idno>
<idno type="wicri:Area/PubMed/Corpus">000084</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000084</idno>
<idno type="wicri:Area/PubMed/Curation">000084</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000084</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000084</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">000084</idno>
<idno type="wicri:Area/Ncbi/Merge">000140</idno>
<idno type="wicri:Area/Ncbi/Curation">000138</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">000138</idno>
<idno type="wicri:doubleKey">1756-0756:2012:Devignes M:functional:classification:of</idno>
<idno type="wicri:Area/Main/Merge">001D32</idno>
<idno type="wicri:Area/Main/Curation">001D13</idno>
<idno type="wicri:Area/Main/Exploration">001D13</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Functional classification of genes using semantic distance and fuzzy clustering approach: evaluation with reference sets and overlap analysis.</title>
<author>
<name sortKey="Devignes, Marie Dominique" sort="Devignes, Marie Dominique" uniqKey="Devignes M" first="Marie-Dominique" last="Devignes">Marie-Dominique Devignes</name>
<affiliation wicri:level="1">
<nlm:affiliation>Lorraine University, Equipe Orpailleur, Campus Scientifique, Vandoeuvre les Nancy cedex, France. devignes@loria.fr</nlm:affiliation>
<country xml:lang="fr">France</country>
<wicri:regionArea>Lorraine University, Equipe Orpailleur, Campus Scientifique, Vandoeuvre les Nancy cedex</wicri:regionArea>
<wicri:noRegion>Vandoeuvre les Nancy cedex</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Benabderrahmane, Sidahmed" sort="Benabderrahmane, Sidahmed" uniqKey="Benabderrahmane S" first="Sidahmed" last="Benabderrahmane">Sidahmed Benabderrahmane</name>
</author>
<author>
<name sortKey="Smail Tabbone, Malika" sort="Smail Tabbone, Malika" uniqKey="Smail Tabbone M" first="Malika" last="Smaïl-Tabbone">Malika Smaïl-Tabbone</name>
</author>
<author>
<name sortKey="Napoli, Amedeo" sort="Napoli, Amedeo" uniqKey="Napoli A" first="Amedeo" last="Napoli">Amedeo Napoli</name>
</author>
<author>
<name sortKey="Poch, Olivier" sort="Poch, Olivier" uniqKey="Poch O" first="Olivier" last="Poch">Olivier Poch</name>
</author>
</analytic>
<series>
<title level="j">International journal of computational biology and drug design</title>
<idno type="ISSN">1756-0756</idno>
<imprint>
<date when="2012" type="published">2012</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Algorithms</term>
<term>Cluster Analysis</term>
<term>Databases, Genetic</term>
<term>Fuzzy Logic</term>
<term>High-Throughput Screening Assays (methods)</term>
<term>Humans</term>
<term>Multigene Family</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>Algorithmes</term>
<term>Analyse de regroupements</term>
<term>Bases de données génétiques</term>
<term>Famille multigénique</term>
<term>Humains</term>
<term>Logique floue</term>
<term>Tests de criblage à haut débit (méthodes)</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en">
<term>High-Throughput Screening Assays</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Algorithms</term>
<term>Cluster Analysis</term>
<term>Databases, Genetic</term>
<term>Fuzzy Logic</term>
<term>Humans</term>
<term>Multigene Family</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>Algorithmes</term>
<term>Analyse de regroupements</term>
<term>Bases de données génétiques</term>
<term>Famille multigénique</term>
<term>Humains</term>
<term>Logique floue</term>
<term>Tests de criblage à haut débit</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Functional classification aims at grouping genes according to their molecular function or the biological process they participate in. Evaluating the validity of such unsupervised gene classification remains a challenge given the variety of distance measures and classification algorithms that can be used. We evaluate here functional classification of genes with the help of reference sets: KEGG (Kyoto Encyclopaedia of Genes and Genomes) pathways and Pfam clans. These sets represent ground truth for any distance based on GO (Gene Ontology) biological process and molecular function annotations respectively. Overlaps between clusters and reference sets are estimated by the F-score method. We test our previously described IntelliGO semantic distance with hierarchical and fuzzy C-means clustering and we compare results with the state-of-the-art DAVID (Database for Annotation Visualisation and Integrated Discovery) functional classification method. Finally, study of best matching clusters to reference sets leads us to propose a set-difference method for discovering missing information.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>France</li>
</country>
</list>
<tree>
<noCountry>
<name sortKey="Benabderrahmane, Sidahmed" sort="Benabderrahmane, Sidahmed" uniqKey="Benabderrahmane S" first="Sidahmed" last="Benabderrahmane">Sidahmed Benabderrahmane</name>
<name sortKey="Napoli, Amedeo" sort="Napoli, Amedeo" uniqKey="Napoli A" first="Amedeo" last="Napoli">Amedeo Napoli</name>
<name sortKey="Poch, Olivier" sort="Poch, Olivier" uniqKey="Poch O" first="Olivier" last="Poch">Olivier Poch</name>
<name sortKey="Smail Tabbone, Malika" sort="Smail Tabbone, Malika" uniqKey="Smail Tabbone M" first="Malika" last="Smaïl-Tabbone">Malika Smaïl-Tabbone</name>
</noCountry>
<country name="France">
<noRegion>
<name sortKey="Devignes, Marie Dominique" sort="Devignes, Marie Dominique" uniqKey="Devignes M" first="Marie-Dominique" last="Devignes">Marie-Dominique Devignes</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
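The record above uses the `wicri:` and `nlm:` prefixes without declaring them, so a strict XML parser will reject it as shown. A minimal sketch of one workaround follows, assuming the record text is available as a string: namespace declarations with placeholder URIs are injected into the `<record>` tag before parsing with Python's standard `xml.etree.ElementTree`. The URIs, the function names and the abridged `SAMPLE` are hypothetical; the actual Dilib output may already declare these namespaces elsewhere.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumption: the real URIs are not shown on this page).
PLACEHOLDER_NS = {"wicri": "urn:example:wicri", "nlm": "urn:example:nlm"}


def parse_record(xml_text):
    """Bind the undeclared prefixes on the root element, then parse."""
    decls = " ".join(f'xmlns:{p}="{u}"' for p, u in PLACEHOLDER_NS.items())
    patched = xml_text.replace("<record>", f"<record {decls}>", 1)
    return ET.fromstring(patched)


def summarize(record):
    """Extract title, author names and DOI from the titleStmt and publicationStmt."""
    title_stmt = record.find("./TEI/teiHeader/fileDesc/titleStmt")
    idnos = record.findall(".//publicationStmt/idno")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [name.text for name in title_stmt.findall("author/name")],
        "doi": next((i.text for i in idnos if i.get("type") == "doi"), None),
    }


# Abridged sample with the same shape as the record above.
SAMPLE = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Functional classification of genes</title>
<author>
<name first="Marie-Dominique" last="Devignes">Marie-Dominique Devignes</name>
<affiliation wicri:level="1">
<country xml:lang="fr">France</country>
</affiliation>
</author>
<author>
<name first="Olivier" last="Poch">Olivier Poch</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="doi">10.1504/IJCBDD.2012.049207</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

summary = summarize(parse_record(SAMPLE))
```

The `xml:` prefix needs no declaration because it is predefined by the XML specification; only the project-specific prefixes have to be bound.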

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001D13 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001D13 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     pubmed:23013652
   |texte=   Functional classification of genes using semantic distance and fuzzy clustering approach: evaluation with reference sets and overlap analysis.
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i   -Sk "pubmed:23013652" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd   \
       | NlmPubMed2Wicri -a InforLorV4 

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022